Ranking Techniques for Cluster Based Search Results in a Textual Knowledge-base

Authors

  • Shefali Sharma
  • Sofus A. Macskassy
Abstract

This paper presents a framework and methodology to improve the search experience in digital library systems. The approach is to cluster a textual knowledge base along multiple relations and return search results in the form of small, focused clusters. Specifically, we generate multiple relationship networks, one per relationship type, and then cluster these networks. At search time, we present a ranked set of clusters, with one ranking per relationship type. The intuition for this approach is that returning clusters of contextually related information gives users situational and contextual awareness of the search results, rather than returning a ranked list of only those documents that match the query. We address the use of both implicit relations (such as textual content) and explicit relations (such as citations, authors, etc.) between documents. The primary question we focus on is how to rank the clusters, given a search query. We explore two approaches: a text-based rank (using the text's similarity to the user's query) and a social network-based rank (using information centrality). A comparison of these two ranking methods suggests that information centrality is very useful for ranking clusters and their documents, because the documents that characterize a cluster get the highest rank.
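The two ranking signals named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the bag-of-words cosine similarity, the mean-per-cluster aggregation, and all function and variable names (`rank_clusters_by_text`, `information_centrality`, the toy `docs` and `clusters`) are assumptions made for illustration. Information centrality is computed with the standard Stephenson-Zelen definition on a connected, undirected, unweighted graph.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_clusters_by_text(clusters, docs, query):
    """Order cluster ids by the mean query similarity of their documents.

    Averaging over members is an assumed aggregation, not taken from the paper.
    """
    q = Counter(query.lower().split())
    score = {
        cid: sum(cosine(Counter(docs[d].lower().split()), q) for d in members)
        / len(members)
        for cid, members in clusters.items()
    }
    return sorted(score, key=score.get, reverse=True)


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting (matrix assumed non-singular)."""
    n = len(matrix)
    aug = [list(map(float, row)) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def information_centrality(nodes, edges):
    """Stephenson-Zelen information centrality (connected, undirected, no duplicate edges)."""
    idx = {v: i for i, v in enumerate(nodes)}
    n = len(nodes)
    b = [[1.0] * n for _ in range(n)]  # start from the all-ones matrix J
    for u, v in edges:                 # add the graph Laplacian: B = L + J
        i, j = idx[u], idx[v]
        b[i][j] -= 1.0
        b[j][i] -= 1.0
        b[i][i] += 1.0
        b[j][j] += 1.0
    c = invert(b)
    trace = sum(c[i][i] for i in range(n))
    row_sum = sum(c[0])                # every row of C has the same sum
    return {v: 1.0 / (c[i][i] + (trace - 2.0 * row_sum) / n) for v, i in idx.items()}


# Toy data (hypothetical): two clusters and a three-document citation chain.
docs = {"d1": "graph clustering for search",
        "d2": "ranking search results",
        "d3": "protein folding"}
clusters = {"c1": ["d1", "d2"], "c2": ["d3"]}
print(rank_clusters_by_text(clusters, docs, "search ranking"))
print(information_centrality(["a", "b", "c"], [("a", "b"), ("b", "c")]))
```

In the chain a-b-c, the middle document b receives the highest information centrality, which matches the abstract's observation that the documents characterizing a cluster rise to the top under this measure.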

Similar resources

Identifying and Ranking the Important Textual and Paratextual Elements in Fiction Retrieval

Purpose: The purpose of this study is to identify the textual and paratextual elements in retrieving fiction from the readers’ perspective in order to provide the most appropriate access points for the readers and to improve access to fiction based on the readers’ needs. Method: The current research is an applied study in terms of purpose, applying a mixed method that was conducted using the ...

FRDC's Cross-lingual Entity Linking System at TAC 2013

In this paper, we present FRDC's system for the cross-lingual entity linking (CLEL) tasks of the NIST Text Analysis Conference (TAC) Knowledge Base Population (KBP2013) track. We propose a joint approach for mention expansion, disambiguation, and clustering. In particular, we adopt a lexicon and rule based method for entity classification, a collaborative acronym expansion meth...

Personalizing the Search for Knowledge

Recent work on building semantic search engines has given rise to large graph-based knowledge repositories and facilities for querying them and, more importantly, ranking the results. While the ranking provided may prove to be acceptable in general, for a truly satisfactory search experience, it is necessary to tailor the results according to the user’s interest. In this paper, we address the is...

Cross-Document Co-Reference Resolution using Sample-Based Clustering with Knowledge Enrichment

Identifying and linking named entities across information sources is the basis of knowledge acquisition and at the heart of Web search, recommendations, and analytics. An important problem in this context is cross-document coreference resolution (CCR): computing equivalence classes of textual mentions denoting the same entity, within and across documents. Prior methods employ ranking, clusterin...

HULTECH at the NTCIR-10 INTENT-2 Task: Discovering User Intents through Search Results Clustering

In this paper, we describe our participation in the Subtopic Mining subtasks of the NTCIR-10 Intent-2 task, for the English language. For this subtask, we experiment with a state-of-the-art algorithm for search results clustering, the HISGKmeans algorithm, and define the users’ intents based on the cluster labels following a general framework. From the Web snippets returned for a given query, our fram...

Publication date: 2009